data: https://www.kaggle.com/divyansh22/summer-olympics-medals
 

The Summer Olympic Medals dataset evaluated in this report is fairly large, consisting of over 15,300 observations and 14 variables (most of them categorical). Each observation represents a single medal won at the Summer Olympic Games from 1976 to 2008. The variables include the year, sport, discipline, event, athlete name, gender, medal type (gold, silver, or bronze), and the country represented. This report explores a variety of visualization techniques to display different aspects of the data: a treemap, a stacked bar chart, a choropleth, a small multiples dodged bar graph, and an animated plot. Each technique was chosen to illustrate a specific aspect of the data and to contribute to a compelling overall story about the dataset.
 

The story begins with a treemap that breaks down sports by discipline and number of medalists. This illustrates which Summer Olympic sports were the most popular in terms of the overall number of medalists per sport. Next, a stacked bar chart shows the number of gold, silver, and bronze medals won by each of the top ten athletes. A choropleth of the total number of medals won by each country then shows which nations have historically been the most successful at the Summer Olympics. After that, the counts of each medal type (gold, silver, and bronze) are compared over time for the top two countries via a small multiples dodged bar graph. Finally, an animated plot of the total number of medals won over time by the top six countries provides a high-level comparison of the top-performing nations.
 

The above treemap illustrates the breakdown of sports by discipline and number of medal-winning athletes for all sports held at the Summer Olympic Games from 1976 to 2008. The larger and darker a rectangle on this treemap, the greater the number of Summer Olympic medalists in that sport or discipline. This graph is informative because it gives a general idea of the popularity of each sport in terms of the total number of athletes who received medals in it over the course of nine Summer Olympics. From this plot, it is apparent that the largest sports were Athletics, Aquatics – swimming, and Rowing (approximately 1,500 medalists each). Some of the smaller sports were Tennis, Badminton, and Modern Pentathlon (fewer than 500 medalists each).
 

In this visualization, both color and area were mapped to the number of medalists. A continuous color scale from light blue to dark blue was used, and red was selected as a text color because it contrasts strongly with blue and so does not blend into the fill. The white text also stands out, but was made slightly transparent by adjusting its opacity in R. This visualization was refined through the drafting process by reversing the direction of the continuous color scale, which initially defaulted to running from dark (low value) to light (high value) instead of light to dark.
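
The scale reversal described above amounts to swapping which end of the gradient the lowest value maps to. The following is a minimal sketch of that idea in Python, not the R code used for the report; the function names `blend` and `medalist_color` and the two RGB endpoints are assumptions chosen for illustration.

```python
def blend(value, lo, hi, start_rgb, end_rgb):
    """Linearly interpolate between two RGB colors for a value in [lo, hi]."""
    t = (value - lo) / (hi - lo) if hi != lo else 0.0
    t = min(max(t, 0.0), 1.0)  # clamp values that fall outside the range
    return tuple(round(s + t * (e - s)) for s, e in zip(start_rgb, end_rgb))

# Assumed endpoint colors; the report does not state the exact shades used.
LIGHT_BLUE = (198, 219, 239)
DARK_BLUE = (8, 48, 107)

def medalist_color(count, lo, hi, reverse=False):
    """Map a medalist count to a fill color.

    reverse=False: low counts are light, high counts are dark.
    reverse=True:  low counts are dark, high counts are light.
    """
    start, end = (DARK_BLUE, LIGHT_BLUE) if reverse else (LIGHT_BLUE, DARK_BLUE)
    return blend(count, lo, hi, start, end)
```

With `reverse=False`, a sport with about 1,500 medalists on a 0 to 1,500 scale gets the darkest fill, which matches the light-to-dark reading used in the final treemap.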
 

The above stacked bar graph shows the number of medals won by the top ten athletes at the Summer Olympics from 1976 to 2008, broken down by medal type (gold, silver, and bronze). One theme that begins to emerge is that, of the top ten athletes who competed over these roughly three decades, the majority represented just two countries: the United States and Russia. The graph also shows that Michael Phelps, the famous American swimmer, won the most Summer Olympic medals (14 gold and 2 bronze) between 1976 and 2008. The previous treemap illustrated that Aquatics – swimming was one of the most popular Summer Olympic disciplines. Aquatics also happens to be the sport with the greatest number of events. Thus, it makes intuitive sense that the top medal-winning athlete would be a swimmer.
 

In this visualization, color was mapped to the type of medal won (gold, silver, or bronze). The plot axes were flipped so that the athletes' names could be read more easily, the y-axis tick marks were removed, and the major and minor grid lines were set to dashed grey. Additionally, through the drafting process, the ggflags package in R was used to add the round flag glyphs showing which country each athlete represents.
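
The data behind a stacked bar chart like this one is a per-athlete count of each medal type, limited to the athletes with the highest totals. The sketch below shows that aggregation in Python; it is an illustration of the logic, not the report's R code, and the function name `top_athletes` and the toy rows are assumptions.

```python
from collections import Counter, defaultdict

def top_athletes(rows, n=10):
    """Return the n athletes with the most medals, with per-type counts.

    rows: iterable of (athlete, medal) pairs, one pair per medal won.
    """
    per_athlete = defaultdict(Counter)
    for athlete, medal in rows:
        per_athlete[athlete][medal] += 1
    # Rank athletes by their total medal count, highest first.
    ranked = sorted(per_athlete.items(),
                    key=lambda item: sum(item[1].values()),
                    reverse=True)
    return [(athlete, dict(counts)) for athlete, counts in ranked[:n]]

# Toy rows, made up for illustration; not drawn from the dataset.
rows = [("A", "Gold"), ("A", "Gold"), ("A", "Bronze"),
        ("B", "Silver"), ("B", "Gold"), ("C", "Bronze")]
top_two = top_athletes(rows, n=2)
```

Each returned entry corresponds to one bar, and each medal-type count within it corresponds to one colored segment of that bar.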
 

The above choropleth shows the total number of medals won by each country in the Summer Olympics from 1976 to 2008. From this visualization, it is overwhelmingly apparent that the two most successful countries over this 32-year span were the United States and Russia (winning nearly 2,000 medals apiece). Germany, too, appears to have won a substantial number of medals. Australia, China, and many of the European nations also did well in terms of total medal counts. Nonetheless, the emergent theme of this graph is the dominant success of the United States and Russia (as well as Germany).
 

This visualization was created by joining the world map data (with latitude and longitude coordinates) in R to the Summer Olympic Medals dataset by country. A continuous color scale from ivory to brown was mapped to the total number of medals won per country. Initially, a few countries had missing data. Through the drafting process, these were set to a default color of ivory in order to preserve the overall effect of the visualization. Finally, the legend was repositioned to the lower left corner of the graph.
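
The join described above can be thought of as a left join from the map's country list onto the medal totals, with a fallback for countries that have no medal record. The sketch below shows one way to get an equivalent effect in Python by filling missing totals with zero, so that they fall at the ivory end of the scale. This is an assumption about the mechanics, not the report's R code, and the function name and toy inputs are hypothetical.

```python
def join_medals_to_map(map_countries, medal_totals, default=0):
    """Left-join medal totals onto the countries present in the map data.

    Countries on the map with no medal record receive `default`, so they
    are drawn at the low end of the color scale instead of as gaps.
    """
    return {country: medal_totals.get(country, default)
            for country in map_countries}

# Toy inputs, made up for illustration; not the report's data.
map_countries = ["Country A", "Country B", "Country C"]
medal_totals = {"Country A": 120, "Country C": 7}
joined = join_medals_to_map(map_countries, medal_totals)
```

Keeping every map country in the result is the important part: an inner join would drop countries without medals and leave holes in the map.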
 

The small multiples dodged bar chart shown above compares the number of bronze, silver, and gold medals won by Russia and the United States in each Summer Olympic Games from 1976 to 2008. Since these were the two most successful nations in terms of overall medals won during the timeframe under analysis, comparing their medal counts over time was warranted. From this visualization, it is apparent that the United States won more gold medals than any other medal type in every Summer Olympics in which it competed, while Russia's leading medal type varies from one Games to the next. It is also evident that Russia's best Summer Olympic year in terms of medals won was 1980. Notably, this was the year that the United States boycotted the Summer Olympic Games. Likewise, Russia did not compete in the following Summer Olympics in 1984.
 

This visualization was constructed in R Markdown using a facet wrap to display the two countries side by side. The bars were dodged so that the three medal types (bronze, silver, and gold) appear alongside one another instead of being stacked, and color was again mapped to medal type. Through the drafting process, a facet wrap proved more effective than a facet grid for comparing the two countries in this small multiples graph.
 

Total Medals Won Over Time For Top Six Countries in Summer Olympic Games (1976-2008)

The above animated visualization shows the total medals won over time by the top six countries in the Summer Olympic Games from 1976 to 2008. On the left is a bar chart that aggregates the number of medals won by each nation over time, and on the right is a line plot that shows the cumulative medals won over time by the same six countries. From this visualization, it is evident that the United States, Russia, and Germany were very competitive with one another, outperforming all other nations by a considerable margin. The next three countries in terms of total medals won were Australia, China, and Italy. This visualization also illustrates that China did not participate in either the 1976 or 1980 Summer Olympics. Consequently, its medal tally begins later than those of the other five countries shown.
 

This visualization was constructed in R Markdown by first creating a rolling total of all medals won by each of the top six countries over the course of nine Summer Olympic Games. Both the animated bar chart and the animated line plot used a discrete pastel color palette from the color brewer library, mapped to the individual countries, and a dark theme to tie the two graphs together visually. The axes of the bar chart were flipped so that the country names could be read more easily along the y-axis, and the country names were added at the end of each line in the line graph via the direct labels library. Finally, both animations were saved as separate GIFs and then stitched together, frame by frame, to create a new, combined GIF displaying both animated plots side by side as a single visualization.
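
The rolling total mentioned above is a cumulative sum of each country's medal counts in chronological order. The sketch below shows that step in Python for a single country; it is an illustration rather than the report's R code, and the function name `rolling_totals` and the toy counts are assumptions.

```python
from itertools import accumulate

def rolling_totals(medals_by_year):
    """Turn per-Games medal counts into cumulative totals, in year order.

    medals_by_year: dict mapping year -> medals won at that Games.
    """
    years = sorted(medals_by_year)
    totals = accumulate(medals_by_year[year] for year in years)
    return dict(zip(years, totals))

# Toy counts, made up for illustration; not the report's data.
per_games = {1984: 30, 1976: 10, 1980: 0, 1988: 25}
print(rolling_totals(per_games))
# {1976: 10, 1980: 10, 1984: 40, 1988: 65}
```

A Games in which a country won nothing, for example because it did not take part, should be entered as zero rather than omitted, so that the cumulative line stays flat for that year instead of skipping it.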
 

Given more time, it would have been insightful to evaluate the population of each country over time and how it relates to the total number of medals won by each nation. The dataset did not contain this information. However, Germany, for example, has a current population of about 84 million people, while the population of the United States is approximately 331 million. Consequently, it is very impressive that Germany remained so competitive with the United States in total medal counts given that its population is only about a quarter of that of the United States.
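
The comparison suggested above could be made concrete by normalizing medal totals by population. The sketch below shows the arithmetic in Python using the two population figures quoted in the text. The medal totals in it are hypothetical placeholders chosen only to show the calculation, and the function name `medals_per_million` is an assumption.

```python
def medals_per_million(total_medals, population_millions):
    """Medals won per million inhabitants."""
    if population_millions <= 0:
        raise ValueError("population must be positive")
    return total_medals / population_millions

# Populations in millions, as quoted in the text above.
GERMANY_POP = 84
US_POP = 331

# Germany's population as a share of the United States' population.
pop_ratio = GERMANY_POP / US_POP  # roughly 0.25

# HYPOTHETICAL medal totals, not taken from the dataset.
example_medals = {"Germany": 1000, "United States": 2000}
rates = {
    "Germany": medals_per_million(example_medals["Germany"], GERMANY_POP),
    "United States": medals_per_million(example_medals["United States"], US_POP),
}
```

A fuller analysis would use each country's population at the time of each Games rather than a single current figure, since populations changed over the 32 years covered.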